Supplementary of Multi-scale Deep Learning Architectures for Person Re-identification

Authors

  • Xuelin Qian
  • Yanwei Fu
  • Yu-Gang Jiang
  • Tao Xiang
  • Xiangyang Xue
Abstract

The Multi-scale-A layer (Fig. 1) analyses the data stream with receptive fields of size 1 × 1, 3 × 3 and 5 × 5. Furthermore, to increase both the depth and the width of this layer, we split the 5 × 5 filter into two cascaded 3 × 3 streams (i.e. stream-4 and stream-3 in Tab. 1 and Fig. 1). The weights of each stream are also tied with those of the corresponding stream in the other branch. This design is, in general, inspired by, yet different from, the Inception architectures [11, 12, 10]. The key difference lies in the weights: they are not tied between any two streams of the same branch, but are tied between the two corresponding streams of different branches.

The Reduction layer (Fig. 2) further processes the data stream at multiple scales and halves the width and height of the feature maps, which should, in principle, be reduced from 78 × 28 to 39 × 14. We thus employ the Reduction layer to gradually decrease the size of the feature representations, as illustrated in Tab. 1 and Fig. 2, following the design principle of “avoid representational bottlenecks” [12]. Our ablation study shows that replacing the Reduction layer with a max-pooling layer to decrease the feature map size lowers Rank-1 accuracy on the CUHK01 dataset by more than 10 absolute percentage points relative to the reported results. Again, the weights of each filter are tied across paired streams.

The Multi-scale-B layer (Fig. 3) serves as the last stage of high-level feature extraction at the scales 1 × 1, 3 × 3 and 5 × 5. Besides splitting the 5 × 5 stream into two cascaded 3 × 3 streams (i.e. stream-4 and stream-3 in Tab. 1 and Fig. 3), we can further decompose each 3 × 3 C-filter into a 1 × 3 C-filter followed by a 3 × 1 C-filter [10]. This leads to several benefits, including reducing the computation cost of the 3 × 3 C-filters, further increasing the ...
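
The arithmetic behind these design choices can be checked with a short sketch. It is only an illustration, not the authors' code: the helper names, the channel count of 64, and the stride-2, 3 × 3, padding-1 setting used for the size reduction are assumptions, since the excerpt gives only the input and output sizes (78 × 28 and 39 × 14).

```python
def stacked_receptive_field(kernel_sizes, strides=None):
    """Receptive field along one axis of a stack of convolutions."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the window by (k - 1) input steps
        jump *= s             # later layers step over the input `jump` pixels at a time
    return rf


def conv_weights(kh, kw, c_in, c_out):
    """Number of weights in a kh x kw convolution, ignoring biases."""
    return kh * kw * c_in * c_out


def conv_output_size(size, kernel, stride, padding):
    """Output length along one axis of a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1


# Two cascaded 3 x 3 filters cover the same 5 x 5 window as a single 5 x 5 filter.
rf_cascade = stacked_receptive_field([3, 3])

# Weight counts for a hypothetical stream with 64 input and 64 output channels.
C = 64  # assumed channel count, not taken from the paper
w_5x5 = conv_weights(5, 5, C, C)
w_two_3x3 = 2 * conv_weights(3, 3, C, C)
w_3x3 = conv_weights(3, 3, C, C)
w_1x3_3x1 = conv_weights(1, 3, C, C) + conv_weights(3, 1, C, C)

# Assumed reduction setting: 3 x 3 kernel, stride 2, padding 1.
reduced = (conv_output_size(78, 3, 2, 1), conv_output_size(28, 3, 2, 1))

print(rf_cascade, w_5x5, w_two_3x3, w_3x3, w_1x3_3x1, reduced)
```

Under these assumptions the cascade keeps the 5 × 5 receptive field with 18 rather than 25 weights per channel pair, the 1 × 3 plus 3 × 1 factorisation uses 6 rather than 9, and the stride-2 step maps 78 × 28 to 39 × 14.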

Similar resources

Multi-Channel Pyramid Person Matching Network for Person Re-Identification

In this work, we present a Multi-Channel deep convolutional Pyramid Person Matching Network (MC-PPMN) based on the combination of the semantic-components and the color-texture distributions to address the problem of person re-identification. In particular, we learn separate deep representations for semantic-components and color-texture distributions from two person images and then employ pyramid ...

Person Re-identification: Past, Present and Future

Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems which make use ...

Multi-pseudo Regularized Label for Generated Samples in Person Re-Identification

Sufficient training data is normally required to train deeply learned models. However, the number of pedestrian images per ID in person re-identification (re-ID) datasets is usually limited, since manual annotations are required for multiple camera views. To produce more data for training deeply learned models, generative adversarial network (GAN) can be leveraged to generate samples for pers...

Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification

Most existing person re-identification (re-id) methods require supervised model learning from a separate large set of pairwise labelled training data for every single camera pair. This significantly limits their scalability and usability in real-world large scale deployments with the need for performing re-id across many camera views. To address this scalability problem, we develop a novel deep...

Tracking by Prediction: A Deep Generative Model for Mutli-Person localisation and Tracking

Current multi-person localisation and tracking systems have an over reliance on the use of appearance models for target re-identification and almost no approaches employ a complete deep learning solution for both objectives. We present a novel, complete deep learning framework for multi-person localisation and tracking. In this context we first introduce a light weight sequential Generative Adv...

Publication date: 2017